AITopics | imsart-ao ver

imsart-ao ver

Adaptive novelty detection with false discovery rate guarantee

Marandon, Ariane, Lei, Lihua, Mary, David, Roquain, Etienne

arXiv.org Machine Learning, Oct-25-2023

This paper studies the semi-supervised novelty detection problem where a set of "typical" measurements is available to the researcher. Motivated by recent advances in multiple testing and conformal inference, we propose AdaDetect, a flexible method that is able to wrap around any probabilistic classification algorithm and control the false discovery rate (FDR) on detected novelties in finite samples without any distributional assumption other than exchangeability. In contrast to classical FDR-controlling procedures that are often committed to a pre-specified p-value function, AdaDetect learns the transformation in a data-adaptive manner to focus the power on the directions that distinguish between inliers and outliers. Inspired by the multiple testing literature, we further propose variants of AdaDetect that are adaptive to the proportion of nulls while maintaining the finite-sample FDR control. The methods are illustrated on synthetic datasets and real-world datasets, including an application in astrophysics.
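As a rough illustration of the conformal-inference and multiple-testing ingredients this abstract refers to, the sketch below computes conformal p-values from a fixed novelty score and then applies the Benjamini-Hochberg procedure. It is not AdaDetect itself: AdaDetect learns the score in a data-adaptive way, whereas here the scores are assumed to be given, larger scores are assumed to mean "more novel", and all function names are hypothetical.

```python
def conformal_pvalues(calib_scores, test_scores):
    """Conformal p-value of each test score against scores of known inliers.

    Assumption: a larger score means the point looks more like a novelty.
    """
    n = len(calib_scores)
    return [
        (1 + sum(1 for c in calib_scores if c >= s)) / (n + 1)
        for s in test_scores
    ]


def benjamini_hochberg(pvalues, alpha):
    """Return the sorted indices rejected by the BH procedure at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose p-value falls below its BH threshold
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Toy usage: 99 inlier scores on a grid, four test points, two clear novelties.
calib = [i / 100 for i in range(1, 100)]
test = [0.505, 2.0, 3.0, 0.305]
pvals = conformal_pvalues(calib, test)
detected = benjamini_hochberg(pvals, alpha=0.1)  # indices 1 and 2
```

In this toy run the two test points with scores far above every calibration score receive the smallest possible p-value, 1/(n+1), and are the only ones flagged.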

artificial intelligence, data mining, machine learning, (16 more...)

2208.06685

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (0.50)
Research Report > New Finding (0.45)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Spatially Adaptive Online Prediction of Piecewise Regular Functions

Chatterjee, Sabyasachi, Goswami, Subhajit

arXiv.org Machine Learning, Mar-30-2022

We consider the problem of estimating piecewise regular functions in an online setting, i.e., the data arrive sequentially and at any round our task is to predict the value of the true function at the next revealed point using the available data from past predictions. We propose a suitably modified version of a recently developed online learning algorithm called the sleeping experts aggregation algorithm. We show that this estimator satisfies oracle risk bounds simultaneously for all local regions of the domain. As concrete instantiations of the expert aggregation algorithm proposed here, we study an online mean aggregation and an online linear regression aggregation algorithm where experts correspond to the set of dyadic subrectangles of the domain. The resulting algorithms are near linear time computable in the sample size. We specifically focus on the performance of these online algorithms in the context of estimating piecewise polynomial and bounded variation function classes in the fixed design setup. The simultaneous oracle risk bounds we obtain for these estimators in this context provide new and improved (in certain aspects) guarantees even in the batch setting and are not available for the state of the art batch learning estimators.
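To make the idea of aggregating experts indexed by dyadic regions more concrete, here is a minimal one-dimensional sketch. It uses a standard specialists-style sleeping-experts update with squared loss, not the modified algorithm of the paper, and the function names, the learning rate `eta`, and the default prediction of 0.0 for an expert with no data are all assumptions made for illustration.

```python
import math


def dyadic_intervals(n):
    """All dyadic intervals (start, length) of {0, ..., n-1}; n is a power of two."""
    intervals = []
    length = n
    while length >= 1:
        intervals.extend((start, length) for start in range(0, n, length))
        length //= 2
    return intervals


def sleeping_experts_online_mean(points, ys, n, eta=1.0):
    """Predict ys[t] at design point points[t] before it is revealed.

    Each expert owns one dyadic interval, is awake only when the queried
    point lies in its interval, and predicts the mean of the responses
    it has seen so far inside that interval.
    """
    experts = dyadic_intervals(n)
    weight = {e: 1.0 for e in experts}
    total = {e: 0.0 for e in experts}
    count = {e: 0 for e in experts}
    predictions = []
    for t, y in zip(points, ys):
        awake = [e for e in experts if e[0] <= t < e[0] + e[1]]
        advice = {e: (total[e] / count[e] if count[e] else 0.0) for e in awake}
        mass = sum(weight[e] for e in awake)
        predictions.append(sum(weight[e] * advice[e] for e in awake) / mass)
        # Exponential-weights update restricted to awake experts; the total
        # weight of the awake set is preserved, sleeping experts are untouched.
        scaled = {e: weight[e] * math.exp(-eta * (advice[e] - y) ** 2) for e in awake}
        renorm = mass / sum(scaled.values())
        for e in awake:
            weight[e] = scaled[e] * renorm
            total[e] += y
            count[e] += 1
    return predictions


# Toy usage: a noiseless step function revealed from left to right.
signal = [0.0] * 8 + [1.0] * 8
preds = sleeping_experts_online_mean(range(16), signal, n=16)
```

On this toy signal the aggregate shifts weight toward the expert owning the right half of the domain once the jump has been observed, which is the local adaptivity the abstract describes in informal terms.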

artificial intelligence, estimator, machine learning, (15 more...)

2203.16587

Country:

North America > United States > Illinois > Champaign County > Champaign (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India > Maharashtra > Mumbai (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Rademacher upper bounds for cross-validation errors with an application to the lasso

Xu, Ning, Fisher, Timothy C. G., Hong, Jian

arXiv.org Machine Learning, Jul-30-2020

We establish a general upper bound for $K$-fold cross-validation ($K$-CV) errors that can be adapted to many $K$-CV-based estimators and learning algorithms. Based on Rademacher complexity of the model and the Orlicz-$\Psi_{\nu}$ norm of the error process, the CV error upper bound applies to both light-tail and heavy-tail error distributions. We also extend the CV error upper bound to $\beta$-mixing data using the technique of independent blocking. We provide a Python package (CVbound, https://github.com/isaac2math) for computing the CV error upper bound in $K$-CV-based algorithms. Using the lasso as an example, we demonstrate in simulations that the upper bounds are tight and stable across different parameter settings and random seeds. As well as accurately bounding the CV errors for the lasso, the minimizer of the new upper bounds can be used as a criterion for variable selection. Compared with the CV-error minimizer, simulations show that tuning the lasso penalty parameter according to the minimizer of the upper bound yields a sparser and more stable model that retains all of the relevant variables.
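For readers unfamiliar with the quantity being bounded, the following sketch computes the $K$-fold CV error of a lasso fit as a function of the penalty. It does not use the CVbound package and does not compute the paper's upper bound; it assumes a toy one-predictor model without intercept, for which the lasso has a closed-form soft-thresholding solution, and every function name is hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def fit_lasso_1d(xs, ys, lam):
    """Minimize (1/(2n)) * sum((y - b*x)^2) + lam * |b| over a scalar b."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys)) / n
    sxx = sum(x * x for x in xs) / n  # assumed to be nonzero
    return soft_threshold(sxy, lam) / sxx


def kfold_cv_error(xs, ys, lam, k=5):
    """Average held-out mean squared error over k interleaved folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        beta = fit_lasso_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        mse = sum((ys[i] - beta * xs[i]) ** 2 for i in held_out) / len(held_out)
        fold_errors.append(mse)
    return sum(fold_errors) / k


# Toy usage: pick the penalty with the smallest CV error on a small grid.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]
grid = [0.0, 1.0, 5.0]
best_lam = min(grid, key=lambda lam: kfold_cv_error(xs, ys, lam))
```

The last line is the "CV-error minimizer" rule that the abstract compares against; the paper's alternative would replace `kfold_cv_error` with its upper bound.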

cv error, imsart-ao ver, tex date, (14 more...)

2007.15598

Country:

North America > United States > New York (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.62)

Batch Policy Learning in Average Reward Markov Decision Processes

Liao, Peng, Qi, Zhengling, Murphy, Susan

arXiv.org Machine Learning, Jul-22-2020

We study the problem of policy optimization in Markov decision processes over infinite time horizons (Puterman, 1994). We focus on the batch (i.e., off-line) setting, where historical data of multiple trajectories has been previously collected using some behavior policy. Our goal is to learn a new policy with guaranteed performance when implemented in the future. In this work, we develop a data-efficient method to learn the policy that optimizes the long-term average reward in a pre-specified policy class from a training set composed of multiple trajectories. Furthermore, we establish a finite-sample regret guarantee, i.e., the difference between the average reward of the optimal policy in the class and the average reward of the estimated policy by our proposed method. This work is motivated by the development of just-in-time adaptive interventions in mobile health (mHealth) applications (Nahum-Shani et al., 2017). Our method can be used to learn a treatment policy that maps the real-time collected information about the individual's status and context to a particular treatment at each of many decision times to support health behaviors.
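The regret described in words above can be written out as follows. The notation is an assumption introduced here for illustration ($\eta^{\pi}$ for the long-term average reward of policy $\pi$, $R_t$ for the reward at decision time $t$, $\Pi$ for the pre-specified policy class, $\hat{\pi}$ for the estimated policy), and the limit is assumed to exist.

```latex
\eta^{\pi} \;=\; \lim_{T \to \infty} \frac{1}{T}\,
  \mathbb{E}_{\pi}\!\left[ \sum_{t=1}^{T} R_t \right],
\qquad
\mathrm{Regret}(\hat{\pi}) \;=\; \sup_{\pi \in \Pi} \eta^{\pi} \;-\; \eta^{\hat{\pi}} .
```

A finite-sample regret guarantee is then a bound on $\mathrm{Regret}(\hat{\pi})$ that holds for a given number of observed trajectories rather than only asymptotically.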

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2007.11771

Country:

North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)

Minimax optimal approaches to the label shift problem

Maity, Subha, Sun, Yuekai, Banerjee, Moulinath

arXiv.org Machine Learning, Apr-4-2020

A key feature of intelligence is to transfer knowledge garnered from one task to another similar but different task. However, statistical learning has by and large been confined to procedures designed to learn from one particular task (through training data) and address the same task on new (test) data. This is inadequate for a wide range of real-world applications where it is important to learn a new task using the knowledge of a partially similar task which has already been learned. The field of transfer learning deals with these kinds of problems and has therefore attracted increasing attention in machine learning and its many varied applications. Recent applications include computer vision [28, 10], speech recognition [14] and genre classification [5].

imsart-ao ver, label shift problem, tex date, (13 more...)

2003.10443

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > New York (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Adaptive Estimation of Multivariate Piecewise Polynomials and Bounded Variation Functions by Optimal Decision Trees

Chatterjee, Sabyasachi, Goswami, Subhajit

arXiv.org Machine Learning, Nov-26-2019

Proposed by Donoho (1997), Dyadic CART is a nonparametric regression method which computes a globally optimal dyadic decision tree and fits piecewise constant functions. In this article we define and study Dyadic CART and a closely related estimator, namely Optimal Regression Tree (ORT), in the context of estimating piecewise smooth functions in general dimensions. More precisely, these optimal decision tree estimators fit piecewise polynomials of any given degree. Like Dyadic CART in two dimensions, we reason that these estimators can also be computed in polynomial time in the sample size via dynamic programming. We prove oracle inequalities for the finite sample risk of Dyadic CART and ORT which imply tight risk bounds for several function classes of interest. Firstly, they imply that the finite sample risk of ORT of order $r \geq 0$ is always bounded by $C k \frac{\log N}{N}$ ($N$ is the sample size) whenever the regression function is piecewise polynomial of degree $r$ on some reasonably regular axis aligned rectangular partition of the domain with at most $k$ rectangles. Beyond the univariate case, such guarantees are scarcely available in the literature for computationally efficient estimators. Secondly, our oracle inequalities uncover optimality and adaptivity of the Dyadic CART estimator for function spaces with bounded variation. We consider two function spaces of recent interest where multivariate total variation denoising and univariate trend filtering are the state of the art methods. We show that Dyadic CART enjoys certain advantages over these estimators while still maintaining all their known guarantees.
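The dynamic-programming idea behind Dyadic CART can be sketched in one dimension with piecewise constant fits. This is a simplified illustration under stated assumptions, not the estimators studied in the paper (which work in general dimensions and with polynomials of any degree): the signal length is assumed to be a power of two, each piece is charged a hypothetical fixed `penalty`, and the function name is made up.

```python
def dyadic_cart_1d(ys, penalty):
    """Penalized least squares over recursive dyadic partitions of ys.

    Minimizes (sum of squared errors) + penalty * (number of pieces),
    where each piece is fit by its mean. Returns (fit, pieces) with
    pieces given as (lo, hi, mean) over half-open index ranges [lo, hi).
    """
    n = len(ys)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"

    # Prefix sums give the squared error of any interval in O(1).
    s1, s2 = [0.0], [0.0]
    for y in ys:
        s1.append(s1[-1] + y)
        s2.append(s2[-1] + y * y)

    def solve(lo, hi):
        """Best (cost, pieces) for the dyadic interval [lo, hi)."""
        size = hi - lo
        total = s1[hi] - s1[lo]
        sse = (s2[hi] - s2[lo]) - total * total / size
        keep_cost = sse + penalty
        keep = (keep_cost, [(lo, hi, total / size)])
        if size == 1:
            return keep
        mid = (lo + hi) // 2
        left_cost, left_pieces = solve(lo, mid)
        right_cost, right_pieces = solve(mid, hi)
        if left_cost + right_cost < keep_cost:
            return (left_cost + right_cost, left_pieces + right_pieces)
        return keep

    _, pieces = solve(0, n)
    fit = [0.0] * n
    for lo, hi, mean in pieces:
        for i in range(lo, hi):
            fit[i] = mean
    return fit, pieces


# Toy usage: a step function is recovered with exactly two pieces.
fit, pieces = dyadic_cart_1d([0, 0, 0, 0, 1, 1, 1, 1], penalty=0.1)
```

Each dyadic interval is visited once, so the recursion runs in time linear in the signal length in this one-dimensional setting.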

artificial intelligence, estimator, machine learning, (18 more...)

1911.11562

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)

New Risk Bounds for 2D Total Variation Denoising

Chatterjee, Sabyasachi, Goswami, Subhajit

arXiv.org Machine Learning, Feb-4-2019

2D Total Variation Denoising (TVD) is a widely used technique for image denoising. It is also an important nonparametric regression method for estimating functions with heterogeneous smoothness. Recent results have shown the TVD estimator to be nearly minimax rate optimal for the class of functions with bounded variation. In this paper, we complement these worst-case guarantees by investigating the adaptivity of the TVD estimator to functions which are piecewise constant on axis-aligned rectangles. We rigorously show that, when the truth is piecewise constant, the ideally tuned TVD estimator performs better than in the worst case. We also study the issue of choosing the tuning parameter. In particular, we propose a fully data-driven version of the TVD estimator which enjoys similar worst-case risk guarantees as the ideally tuned TVD estimator.

estimator, imsart-ao ver, lemma 6, (14 more...)

1902.01215

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
Europe > France (0.04)

Genre:

Research Report (0.81)
Workflow (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Semi-supervised learning in unbalanced and heterogeneous networks

Li, Ting, Ying, Ningchen, Yu, Xianshi, Jing, Bin-Yi

arXiv.org Machine Learning, Jan-7-2019

Community detection is a hot topic in network analysis, where the main aim is to perform unsupervised learning or clustering in networks. Recently, semi-supervised learning has received increasing attention among researchers. In this paper, we propose a new algorithm, called weighted inverse Laplacian (WIL), for predicting labels in partially labeled networks. The idea comes from the first hitting time of a random walk, and it also has natural interpretations in terms of both information propagation and the regularization framework. We propose a partially labeled degree-corrected block model (pDCBM) to describe the generation of partially labeled networks. We show that WIL ensures the misclassification rate is of order $O(\frac{1}{d})$ for the pDCBM with average degree $d=\Omega(\log n)$, and that it can handle situations with greater imbalance than traditional Laplacian methods. WIL outperforms other state-of-the-art methods in most of our simulations and real datasets, especially in unbalanced networks and heterogeneous networks.

algorithm, imsart-ao ver, tex date, (12 more...)

1901.01696

Country:

North America > United States (0.28)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Government > Voting & Elections (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Optimality and Sub-optimality of PCA I: Spiked Random Matrix Models

Perry, Amelia, Wein, Alexander S., Bandeira, Afonso S., Moitra, Ankur

arXiv.org Machine Learning, Jul-2-2018

A central problem of random matrix theory is to understand the eigenvalues of spiked random matrix models, introduced by Johnstone, in which a prominent eigenvector (or "spike") is planted into a random matrix. These distributions form natural statistical models for principal component analysis (PCA) problems throughout the sciences. Baik, Ben Arous and Péché showed that the spiked Wishart ensemble exhibits a sharp phase transition asymptotically: when the spike strength is above a critical threshold, it is possible to detect the presence of a spike based on the top eigenvalue, and below the threshold the top eigenvalue provides no information. Such results form the basis of our understanding of when PCA can detect a low-rank signal in the presence of noise. However, under structural assumptions on the spike, not all information is necessarily contained in the spectrum. We study the statistical limits of tests for the presence of a spike, including non-spectral tests. Our results leverage Le Cam's notion of contiguity, and include: i) For the Gaussian Wigner ensemble, we show that PCA achieves the optimal detection threshold for certain natural priors for the spike. ii) For any non-Gaussian Wigner ensemble, PCA is sub-optimal for detection. However, an efficient variant of PCA achieves the optimal threshold (for natural priors) by pre-transforming the matrix entries. iii) For the Gaussian Wishart ensemble, the PCA threshold is optimal for positive spikes (for natural priors) but this is not always the case for negative spikes.
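The phase transition for the top eigenvalue mentioned above can be stated explicitly in the standard single-spike setting. The notation here is an assumption for illustration: $n$ samples in dimension $p$ with $p/n \to \gamma$, population covariance $I + \beta v v^{\top}$ with $\|v\| = 1$ and spike strength $\beta > 0$, and $\lambda_{\max}$ the largest eigenvalue of the sample covariance matrix.

```latex
\lambda_{\max} \;\longrightarrow\;
\begin{cases}
  (1 + \sqrt{\gamma})^{2}, & \beta \le \sqrt{\gamma}, \\[4pt]
  (1 + \beta)\left(1 + \dfrac{\gamma}{\beta}\right), & \beta > \sqrt{\gamma}.
\end{cases}
```

Below the threshold $\beta = \sqrt{\gamma}$ the limit coincides with the edge of the noise spectrum, which is why the top eigenvalue alone cannot reveal the spike there; above it, the top eigenvalue separates from the bulk.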

artificial intelligence, machine learning, threshold, (16 more...)

1807.00891

Country:

Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.24)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(3 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Boulevard: Regularized Stochastic Gradient Boosted Trees and Their Limiting Distribution

Zhou, Yichen, Hooker, Giles

arXiv.org Machine Learning, Jun-25-2018

This paper presents a theoretical study of gradient boosted trees (GBT: Friedman, 2001). Machine learning methods for prediction have generally been thought of as trading off both intelligibility and statistical uncertainty quantification in favor of accuracy. Recent results have started to provide a statistical understanding of methods based on ensembles of decision trees (Breiman et al., 1984). In particular, the consistency of methods related to Random Forests (RFs: Breiman, 2001) has been demonstrated in Biau (2012); Scornet et al. (2015) while Wager et al. (2014); Mentch and Hooker (2016); Wager and Athey (2017) and Athey et al. (2016) prove central limit theorems for RF predictions. These have then been used for tests of variable importance and nonparametric interactions in Mentch and Hooker (2017). In this paper, we extend this analysis to GBT. Analyses of RFs have relied on a subsampling structure to express the estimator in the form of a U-statistic from which central limit theorems can be derived. By contrast, GBT produces trees sequentially with the current tree depending on the values in those built previously, requiring a different analytical approach. While the algorithm proposed in Friedman (2001) is intended to be generally applicable to any loss function, in this paper we focus specifically on nonparametric regression (Stone, 1977, 1982).
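The sequential dependence described above, where each tree is fit to what the previous trees left unexplained, can be seen in a minimal sketch of plain gradient boosting for squared-error regression. This is the generic scheme with depth-one trees (stumps) on a single predictor, not the Boulevard algorithm; the function names, the number of trees, and the learning rate are illustrative assumptions.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump: returns (threshold, left_mean, right_mean)."""
    thresholds = sorted(set(xs))
    if len(thresholds) < 2:
        mean = sum(residuals) / len(residuals)
        return thresholds[0], mean, mean
    best = None
    for t in thresholds[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm


def gradient_boost(xs, ys, n_trees=50, learning_rate=0.1):
    """Fit stumps one after another to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared loss the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Toy usage: a noiseless step function on a grid of 20 points.
xs = [i / 10 for i in range(20)]
ys = [0.0 if x < 1 else 1.0 for x in xs]
model = gradient_boost(xs, ys)
```

Because every stump depends on the residuals left by its predecessors, the trees are not exchangeable draws as in a subsampled random forest, which is the structural difference the abstract points to.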

artificial intelligence, imsart-ao ver, machine learning, (18 more...)

1806.09762

Country: North America > United States > New York > Tompkins County > Ithaca (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.51)
